Regret Minimization Under Partial Monitoring


Similar resources

Robust approachability and regret minimization in games with partial monitoring

Approachability has become a standard tool in analyzing learning algorithms in the adversarial online learning setup. We develop a variant of approachability for games where there is ambiguity in the obtained reward that belongs to a set, rather than being a single vector. Using this variant we tackle the problem of approachability in games with partial monitoring and develop simple and efficie...

Regret Bounds and Minimax Policies under Partial Monitoring

This work deals with four classical prediction settings, namely full information, bandit, label efficient and bandit label efficient as well as four different notions of regret: pseudoregret, expected regret, high probability regret and tracking the best expert regret. We introduce a new forecaster, INF (Implicitly Normalized Forecaster) based on an arbitrary function ψ for which we propose a u...

Partial Monitoring - Classification, Regret Bounds, and Algorithms

In a partial monitoring game, the learner repeatedly chooses an action, the environment responds with an outcome, and then the learner suffers a loss and receives a feedback signal, both of which are fixed functions of the action and the outcome. The goal of the learner is to minimize his regret, which is the difference between his total cumulative loss and the total loss of the best fixed acti...
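The protocol described in this abstract can be illustrated with a minimal Python sketch. It assumes the standard definition of regret against the best fixed action in hindsight; the matrices `LOSS` and `FEEDBACK` and the function names are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# Hypothetical 2-action, 2-outcome game (illustrative values only).
# LOSS[a][o] is the loss of action a under outcome o.
LOSS: List[List[float]] = [[0.0, 1.0],
                           [1.0, 0.0]]
# FEEDBACK[a][o] is the signal shown to the learner.
# Action 0 always yields "x" (uninformative); action 1 reveals the outcome.
FEEDBACK: List[List[str]] = [["x", "x"],
                             ["a", "b"]]


def feedback_signals(feedback: Sequence[Sequence[str]],
                     actions: Sequence[int],
                     outcomes: Sequence[int]) -> List[str]:
    """Signals the learner observes; it never sees losses or outcomes directly."""
    return [feedback[a][o] for a, o in zip(actions, outcomes)]


def regret(loss: Sequence[Sequence[float]],
           actions: Sequence[int],
           outcomes: Sequence[int]) -> float:
    """Cumulative loss of the learner minus that of the best fixed action."""
    learner_loss = sum(loss[a][o] for a, o in zip(actions, outcomes))
    best_fixed = min(sum(row[o] for o in outcomes) for row in loss)
    return learner_loss - best_fixed


actions = [0, 1, 0, 1]
outcomes = [0, 0, 1, 0]
print(feedback_signals(FEEDBACK, actions, outcomes))  # ['x', 'a', 'x', 'a']
print(regret(LOSS, actions, outcomes))                # 2.0
```

In this toy run the learner pays a total loss of 3 while always playing action 0 would have cost 1, so the regret is 2. The learner only learns the outcome in the rounds where it plays action 1.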

Multi-Agent Counterfactual Regret Minimization for Partial-Information Collaborative Games

We study the generalization of counterfactual regret minimization (CFR) to partial-information collaborative games with more than 2 players. For instance, many 4-player card games are structured as 2v2 games, with each player only knowing the contents of their own hand. To study this setting, we propose a multi-agent collaborative version of Kuhn Poker. We observe that a straightforward applicat...

Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence

This paper examines the problem of multi-agent learning in N-person non-cooperative games. For concreteness, we focus on the so-called “hedge” variant of the exponential weights (EW) algorithm, one of the most widely studied algorithmic schemes for regret minimization in online learning. In this multi-agent context, we show that a) dominated strategies become extinct (a.s.); and b) in generic g...
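For orientation, the Hedge (exponential weights) update can be sketched for a single agent as follows. This is a minimal illustration under stated assumptions, not the paper's multi-agent formulation: the learning rate `eta`, the loss-based convention, and the name `hedge` are choices made here.

```python
import math
from typing import List, Sequence


def hedge(loss_rounds: Sequence[Sequence[float]], eta: float) -> List[List[float]]:
    """Return the distribution Hedge plays in each round.

    loss_rounds[t][i] is the loss of action i in round t (full information).
    eta is a positive learning rate (an assumed, fixed value).
    """
    n_actions = len(loss_rounds[0])
    cumulative = [0.0] * n_actions
    history: List[List[float]] = []
    for losses in loss_rounds:
        # Weights are proportional to exp(-eta * cumulative loss); subtracting
        # the minimum first keeps the exponentials numerically stable.
        baseline = min(cumulative)
        weights = [math.exp(-eta * (c - baseline)) for c in cumulative]
        total = sum(weights)
        history.append([w / total for w in weights])
        # Update cumulative losses only after the distribution is played.
        cumulative = [c + l for c, l in zip(cumulative, losses)]
    return history


# Action 1 is always worse than action 0, so its probability shrinks.
rounds = [[0.0, 1.0]] * 50
probs = hedge(rounds, eta=0.5)
print(probs[0])   # [0.5, 0.5]
print(probs[-1])  # probability of action 0 is close to 1
```

In this toy run the always-worse action loses probability mass geometrically, which gives a single-agent intuition for the extinction of dominated strategies mentioned in the abstract.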

Journal

Journal title: Mathematics of Operations Research

Year: 2006

ISSN: 0364-765X, 1526-5471

DOI: 10.1287/moor.1060.0206